PRELIMINARY AMENDMENT



Assistant Commissioner for Patents 
Washington, D.C. 20231 

Dear Sir: 

Prior to the first Office Action, please amend the above-identified 
application as follows: 



In the Specification: 

On page 1, line 3, at the beginning of the text insert: 
--This is a continuation of co-pending application serial number
08/665,404, filed on June 18, 1996.--



In the Claims:

Cancel claims 1 through 33. 



34. (Amended) A computer data storage medium storing a
correspondence table which enables compression of a pronunciation
dictionary, the correspondence table comprising:
    a plurality of correspondence sets, each correspondence set
    including
        a correspondence text entry; [and]
        a correspondence phoneme entry representing the
        pronunciation of the correspondence text entry;[ ? ] and
        a correspondence symbol identifying the correspondence set.



Add the following claims 35 through 53: 



35. The computer data storage medium of claim 34 further storing a
tuning function for optimizing said correspondence table.

36. The computer data storage medium of claim 35 wherein said
tuning function eliminates redundant correspondence sets and low
usage correspondence sets from said correspondence table.

37. The computer data storage medium of claim 34 wherein said
correspondence table includes said correspondence sets for all practical
combinations of said correspondence text entries and said
correspondence phoneme entries for a given language.

38. The computer data storage medium of claim 34 further storing:
    a grouping of a plurality of said correspondence sets.

39. The computer data storage medium of claim 38 wherein said
correspondence phoneme entries of said grouping are similar to one
another in pronunciation.

40. A system for storing a pronunciation guide comprising:
    a correspondence table for storing pronunciation data; and
    a tuning function for optimizing said correspondence table.

41. The system of claim 40 wherein said correspondence table
comprises at least one correspondence set.

42. The system of claim 41 wherein said tuning function eliminates
redundant correspondence sets from said correspondence table.

43. The system of claim 42 further comprising:
    a correspondence symbol corresponding to said text entry and to
    said phonetic entry for identifying said correspondence set.



44. The system of claim 42 wherein said correspondence table includes
said correspondence sets for all practical combinations of said
correspondence text entries and said phonetic entries for a given
language.

45. The system of claim 42 further comprising:
    a grouping of a plurality of said correspondence sets.

46. The system of claim 45 wherein said phonetic entries of said
grouping are similar to one another in pronunciation.

47. The system of claim 41 wherein said tuning function eliminates
low usage correspondence sets from said correspondence table.

48. The system of claim 41 wherein said at least one correspondence
set comprises:
    a correspondence text entry; and
    a phonetic entry corresponding to said correspondence text entry.

49. The system of claim 48 wherein said phonetic entry is a phoneme,
an allophone, or a syllable.

50. A method of storing a pronunciation guide, comprising the steps
of:
    inputting a correspondence set into a correspondence table; and
    inputting into said correspondence table a correspondence symbol
    corresponding to said correspondence set.

51. The method of claim 50 further comprising the steps of:
    optimizing said correspondence table; and
    grouping a plurality of said correspondence sets.

52. The method of claim 51 wherein said step of optimizing further
comprises the steps of:
    eliminating redundant correspondence sets from said
    correspondence table; and
    adding productive correspondence sets to said correspondence
    table.

53. The method of claim 50 wherein said step of inputting a
correspondence set further comprises the steps of:
    inputting a correspondence text entry into said correspondence
    table; and
    inputting a phonetic entry corresponding to said correspondence
    text entry into said correspondence table.

REMARKS



The Office Action mailed February 20, 1998 (paper #5), in parent 
application 08/665,404 allowed claims 1-33. This preliminary amendment 
cancels claims 1-33, amends claim 34, and adds nineteen new claims 35-53. 
Claims 34-53 are now pending in this continuing Application. No new matter 
is being added. In view of the following remarks, Applicant respectfully 
requests reconsideration of the rejections and further examination of the 
continuing Application as amended. 

Rejection under 35 U.S.C. § 102(b)

The Examiner rejected claim 34 under 35 U.S.C. § 102(b) as being
anticipated by U.S. Patent No. 4,779,080 by Coughlin et al., Electronic
Information Display Systems (hereinafter Coughlin).

In paragraph 4, page 3 of the Office Action, the Examiner stated that
"Coughlin et al teach[es] a computer data storage medium storing a
correspondence table ... the correspondence table comprising: a plurality of
correspondence sets, each correspondence set including ... a correspondence
phoneme entry representing the pronunciation of the correspondence text entry
and a correspondence symbol identifying the correspondence set (his address
data)". Applicant respectfully traverses. As new claims 35-53 are similar in
scope to claim 34, no new search should be required and the following
arguments anticipate similar rejections to claims 35-53.



Claim 34 (as amended) recites "a correspondence phoneme entry
representing the pronunciation of the correspondence text entry . . . ." This
limitation is supported in the Specification on page 10, lines 25-26 as follows:
"a phoneme entry expressing at least one phonetic sound." Further, the
Specification reflects that "although the invention has been described using
phonemes, other alternative means for representing pronunciation of text are
possible . . ." (page 22, lines 8-10, emphasis added).

In contrast, Coughlin teaches a dictionary "contained in [a] compact disc 
... for each of the text words that make up [a] dictionary. . . . For each text 
word, the set of data comprises ... a sound portion giving an audible 
pronunciation." (Col. 5, lines 22-26, emphasis added). In addition, Coughlin
teaches an area "for selecting a sound pronunciation of the text word." (Col. 6, 
lines 40-41). Thus, Coughlin includes an actual audio pronunciation (or 
recording) of the text word when selected for playback by the system user. 
(Col. 5, lines 12-14). Coughlin does NOT teach a means for "representing the 
pronunciation of the correspondence text entry" as does the present invention. 
In addition, Coughlin does not disclose or teach the working of the sound 
portion. 

A phoneme, or phonetic entry, is a guide to pronunciation and not the
actual audio recording of the text. A phoneme is an abstract representation
used in a phonetic system of a language that corresponds to a similar speech
sound which is perceived to be a single distinctive sound of a basic speech unit
in the language. For example, in English, the text "au" may be represented by
the phoneme "OW." The phoneme represents the sound that is made by a
speaker when pronouncing the text, not the actual sound of the text. Coughlin
does not teach a "correspondence phoneme entry representing the
pronunciation of the correspondence text" of the correspondence table; nor
does Coughlin teach, in any manner, the use of phonemes. On the contrary,
Coughlin, finding a pronunciation guide to be inadequate, teaches away from
the use of a phonetic guide. (Table, Col. 5, lines 12-14). Coughlin further states
that a "pronunciation guide is not usually very helpful to the typical reader."
(Col. 4, lines 43-44). Therefore, Coughlin does not teach a pronunciation guide
and, thus, does not anticipate the present invention.

Furthermore, claim 34 (as amended) recites "a correspondence table . . .
comprising: a plurality of correspondence sets, each correspondence set
including: a correspondence text entry; a correspondence phoneme entry
representing the pronunciation of the correspondence text entry; and a
correspondence symbol identifying the correspondence set." This limitation is
supported in the Specification on page 6, lines 11-15 as follows:
"[c]orrespondence table 240 lists phoneme entries and text entries .... A text
entry, a phoneme entry and a symbol together form a correspondence set, and
a plurality of correspondence sets forms correspondence table 240." In
addition, the present invention teaches that: "[c]orrespondence table 240
preferably includes correspondence sets for most practical combinations of
correspondence text and phonemes in a given language." (page 6, lines 18-20).



Coughlin does NOT teach a correspondence table that includes the
correspondence text and phoneme entries that "enable compression of a
pronunciation dictionary." Furthermore, Coughlin teaches away from the use
of a correspondence table. (Col. 4, lines 41-45). Therefore, Coughlin does not
teach a correspondence table of text entries and phoneme entries and, thus,
does not anticipate the present invention.

Claim 34 is also distinguishable over Coughlin where it recites "a 
correspondence table which enables compression of a pronunciation 
dictionary." This limitation is supported in the Specification on page 6, lines 
23-26 as follows: "a tuning function 260 facilitates eliminating the less useful 
correspondence sets from, and adding more useful correspondence sets to, 
correspondence table 240." Further, "tuning function 260 ... [is used] to 
compress at least a portion of pronunciation dictionary 210 . . .." (page 11, 
lines 4-6). In contrast, Coughlin does NOT teach or disclose, in any manner, 
the compression or optimization of a pronunciation guide. 

Finally, claim 34 recites "a correspondence symbol identifying the
correspondence set." This limitation is supported in the Specification on page
6, lines 11-17 as follows: "[e]ach correspondence set includes an identifier,
referred to as a correspondence symbol, which may be simply the address of
the set in correspondence table 240." The present invention uses the
correspondence symbol to identify the phoneme for a given text entry (page 7,
lines 12-15).

Coughlin, in contrast, discloses "address data [which gives] the address
where the set of data for each text word is located in the compact disc (CD
ROM)." (Col. 5, lines 34-36). Thus, the address data is the address of the
dictionary entry, not an identifier to a correspondence set entry as in the
present invention. Therefore, Coughlin's address data does not anticipate the
correspondence symbol of the present invention.

In view of the above remarks, Applicant respectfully contends that the 
rejection of claim 34 (and, by extension, claims 35-53) on all of the grounds set 
forth is fully traversed and overcome, and should be withdrawn, and that the 
continuing Application is in condition for allowance. 



Respectfully submitted, 
Timothy J. Fredenburg 




Dated: [illegible]



by: 



J. Eppaliite, Reg. No. 30,266 
Carr & Ferrell, LLP 



2225 East Bayshore Road, Suite 200 
Palo Alto, CA 94303 



Telephone: (650) 812-3428 
Facsimile: (650) 812-3444 

SYSTEM AND METHOD FOR USING A CORRESPONDENCE TABLE TO
COMPRESS A PRONUNCIATION GUIDE

BACKGROUND OF THE INVENTION 
1. Field of the Invention

This invention relates generally to data compression, and more 
particularly to a system and method using correspondence 
techniques to compress a pronunciation guide. 

2. Description of the Background Art

Computer Random Access Memory (RAM) and disk space are
becoming more available and affordable in desktop computer
systems. A typical desktop computer system currently provides on
the order of sixteen megabytes of RAM and one gigabyte of hard disk
memory. This increasing availability allows programmers the
freedom to create application programs and data files which occupy
several megabytes of computer memory. However, minimizing the
size of data files remains important for optimizing system
performance and use of memory resources.

To minimize storage requirements, programmers compress
large data files. One type of large file is a pronunciation dictionary,
which includes dictionary words for a language such as American
English and dictionary phonemes (phonetic sounds) representing the
pronunciation of each of the dictionary words. A typical
uncompressed pronunciation dictionary occupies up to about ten
megabytes of memory.

Information such as a pronunciation dictionary can be
compressed using certain symbols to replace redundant data. For
example, a typical compression technique assigns symbols to
represent particular patterns of redundant data such as multiple
zeros or ones. Multiple compression techniques may be performed
successively to eliminate more redundancies and compress data
further. Accordingly, a pronunciation dictionary may be compressed
to around thirty percent or less of its original size.

Previous techniques for compressing pronunciation dictionaries
do not take into account redundancies inherent in dictionary words
and dictionary phonemes. Therefore, as an addition to other
techniques for compressing a pronunciation dictionary, it is desirable
to have a system and method for taking advantage of redundancies
in pronunciation.

SUMMARY OF THE INVENTION 

The present invention overcomes limitations and deficiencies of
previous systems by providing a new system and method for
compressing a pronunciation guide such as a pronunciation
dictionary. The system substitutes a single symbol for some text and
its pronunciation, and includes a central processing unit (CPU) and
memory. The memory stores a compression system including
parsing routines, a correspondence table, a matching system, a
decoder table and a decoder system. The parsing routines extract a
dictionary entry, which comprises a dictionary word and
corresponding dictionary phonemes representing the pronunciation
of the dictionary word, from an uncompressed pronunciation
dictionary also stored in the memory. The correspondence table is
made up of correspondence sets, each of which has a text entry, a
phoneme entry representing the pronunciation of the text entry, and
a set-identifying symbol (i.e., a number). The matching system
attempts to find all correspondence sets that match text and
phoneme combinations of the dictionary entry.

If matches are found, then the matching system selects the best
matches and adds the representative correspondence symbol set to a
compressed pronunciation dictionary. If a match is not found, then
the matching system considers characters silent and/or phonemes
unmatched, and assigns special symbols to be added to the
compressed pronunciation dictionary. The matching system adds
decoder code sets to a decoder table for translating the special
symbols back to characters or phonemes.

The decoder system uses the compressed pronunciation
dictionary and decoder code sets to generate corresponding
phonemes for selected text. These phonemes can be used in
processes such as speech recognition, speech synthesis, language
translation, foreign language learning, spell checking, etc.

The present invention provides a method for compressing a
pronunciation dictionary. The method creates a correspondence
table comprised of correspondence sets, determines which
correspondence sets match a dictionary word and its corresponding
dictionary phonemes, and adds the correspondence symbols as
compressed data entries to a compressed pronunciation dictionary.
The invention also provides a method for using the compressed
dictionary and decoder code sets to generate phonemes from input
text.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a computer system including a
compression system in accordance with the present invention;

FIG. 2 is a block diagram showing dictionary compressing
components of the FIG. 1 compression system used to construct a
compressed pronunciation dictionary;

FIG. 3 is a text-phoneme correspondence table for American
English;

FIG. 4 is a block diagram showing components of the FIG. 1
compression system used in application of the compressed
dictionary;

FIG. 5 is a flowchart illustrating the preferred method for
compressing a pronunciation dictionary and using the compressed
pronunciation dictionary for decoding selected text;

FIG. 6 is a flowchart further illustrating steps of the preferred
method for compressing an entry from a pronunciation dictionary;
and

FIG. 7 is an example phoneme set for American English.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

FIG. 1 is a block diagram of a computer system 100 including a
compression system 180 in accordance with the present invention.
Computer system 100 is preferably based on a computer such as a
Power Macintosh manufactured by Apple Computer, Inc. of
Cupertino, California. Computer system 100 includes a Central
Processing Unit (CPU) 110, an input device 120 such as a keyboard
and mouse or scanner, an output device 130 such as a Cathode
Ray Tube (CRT) or audio speaker, a Random Access Memory (RAM)
150, a data storage (hard disk) 160, an operating system 170 and a
compression system 180, each coupled to signal bus 140.

Operating system 170 is a program that controls processing by
CPU 110, and is typically stored in data storage 160 and loaded into
RAM 150 during computer system initialization. CPU 110 has access
to RAM 150 for storing intermediate results and miscellaneous data.

Compression system 180 includes a dictionary compressing
program 215 for compressing a pronunciation dictionary, and a
decoder system program 420 for subsequently processing text and
using the compressed pronunciation dictionary to retrieve phonemes
representing the pronunciation of the text. Compression system 180
is also typically stored in data storage 160 and loaded into RAM 150
prior to execution by CPU 110.

FIG. 2 is a block diagram illustrating dictionary compressing
program 215 of compression system 180, used with a pronunciation
dictionary 210 to construct a compressed pronunciation dictionary
270. Pronunciation dictionary 210 is preferably a conventional
compilation of dictionary words and of corresponding dictionary
phonemes in a specified format expressing proper pronunciation of
the dictionary words in, for example, American English. Suitable
pronunciation dictionaries include the Oxford-American® Dictionary
or the Random House® Dictionary. FIG. 7 illustrates an example
phoneme list 700 for American English. List 700 includes thirty-eight
phonemes and an example word which uses each phoneme. For
example, the phoneme "AE" provides the sound made by the letter
"a" as in the word "bat." Other phonemes or sound-representative
symbols can alternatively be used.

Dictionary compressing program 215 includes parsing routines
220, a data buffer 230, a correspondence table 240, a matching
system 250 and a tuning function 260. Parsing routines 220 extract
a dictionary entry, which includes a dictionary word and at least one
dictionary phoneme representing the pronunciation of the word,
from pronunciation dictionary 210. For the example word "enough",
the extracted dictionary entry includes the dictionary word "enough"
and the corresponding phonemes "IH n UX f". Parsing routines 220
store the extracted dictionary entry in data buffer 230, which may
be a portion of RAM 150 (FIG. 1).

Correspondence table 240 lists phoneme entries and text
entries, for example, as shown for American English in FIG. 3. A text
entry, a phoneme entry and a symbol together form a
correspondence set, and a plurality of correspondence sets forms
correspondence table 240. Each correspondence set includes an
identifier, referred to as a correspondence symbol, which may be
simply the address of the set in correspondence table 240.

Correspondence table 240 preferably includes correspondence
sets for most practical combinations of correspondence text and
phonemes in a given language. A correspondence table 240 which
included every conceivable correspondence set would be inefficient
because increasing the number of code sets degrades compression by
subsequent compression techniques. Therefore, a tuning function
260 facilitates eliminating the less useful correspondence sets from,
and adding more useful correspondence sets to, correspondence table
240. The utility or productivity of a correspondence set is
determined by the number of dictionary entries it helps to compress.
A pronunciation dictionary may be compressed a first time, and the
compressed dictionary examined to determine if any correspondence
sets are used less than, say, five times. If so, the less used and thus
unproductive correspondence sets can be eliminated or modified.
Further, since phonemes typically have corresponding text, cases
where a phoneme does not match any text may indicate a need to
add a correspondence set.
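
By way of illustration only, the usage check described above might be sketched in Python as follows. The function name find_unproductive, the representation of a symbol set as a list of integers, and the sample data are assumptions made for this sketch and are not part of the specification; the default threshold of five follows the "say, five times" example in the text.

```python
from collections import Counter

def find_unproductive(symbol_sets, table_symbols, min_uses=5):
    """Return the correspondence symbols used fewer than min_uses times.

    symbol_sets   -- hypothetical compressed dictionary: one list of
                     correspondence symbols per dictionary entry
    table_symbols -- every correspondence symbol in the table
    """
    # Count how often each symbol appears across all compressed entries.
    usage = Counter(sym for symbol_set in symbol_sets for sym in symbol_set)
    # Symbols that never appear have a count of zero in a Counter.
    return sorted(s for s in table_symbols if usage[s] < min_uses)

# Hypothetical first-pass compression result for five dictionary entries.
compressed = [[1, 2], [1, 2], [1, 2], [1, 2], [1, 3]]
candidates = find_unproductive(compressed, table_symbols=[1, 2, 3, 4])
```

Here symbol 1 is used five times and is kept, while symbols 2, 3 and 4 fall below the threshold and become candidates for elimination or modification.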

Matching system 250 is a program which reads the extracted
dictionary entry from buffer 230, retrieves correspondence sets from
correspondence table 240, and compares the dictionary entry with
the correspondence sets. More particularly, matching system 250
attempts to match the correspondence sets with combinations of
phonemes and characters from the dictionary entry. If matches are
made, matching system 250 assigns the correspondence symbol
associated with the "best" matching correspondence set as a
compressed data entry, as described below with reference to FIG. 6.
If a match cannot be made for a particular dictionary character or
phoneme, matching system 250 assigns, as compressed data entries,
special symbols to represent silent characters or unmatched
phonemes. The one or more compressed data entries representing
an entire dictionary entry form a "symbol set." The symbol sets for
an entire pronunciation dictionary collectively form the "compressed
pronunciation dictionary" 270.

Matching system 250 further generates decoder code sets for
de-compressing compressed pronunciation dictionary 270, and adds
the code sets to a "decoder table" 280. Each decoder code set
includes a decoder text entry, a corresponding decoder phoneme
entry, and a decoder set-identifying symbol equivalent to a
correspondence symbol of correspondence table 240. Decoder table
280 is like correspondence table 240 except that decoder table 280
also includes decoder sets for the silent text characters and the
unmatched phonemes. The decoder sets are described in more detail
with reference to FIGs. 4 and 5.


FIG. 3 shows an example correspondence table 240 for
American English. The first column specifies correspondence
phoneme entries, the second column specifies correspondence text
entries, and the third column specifies correspondence symbols. A
correspondence text entry specifies text characters such as "e" or
"ou," and is accompanied by typically only one phonetic sound. A
correspondence phoneme entry, such as "IH," is expressed in the
format used by pronunciation dictionary 210, for representing the
phonetic sound of each correspondence text entry. Since some text
entries produce multiple sounds, a phoneme entry may represent
multiple sounds such as "y UH." Further, there may be
correspondence entries which have multiple text characters and
multiple phonemes, like "y UW->ieu."

The correspondence sets may be organized into groups of rows
of like phonemes. Grouping rows based on phonemes facilitates
comparison with dictionary combinations if creating table 240 by
hand. In the first row of table 240, correspondence phoneme "AE"
represents one of the possible pronunciations of correspondence text
entry "ai", and this correspondence set is represented by the symbol
"(1)". In the second row, the same correspondence phoneme "AE"
represents one of the possible pronunciations of a different
correspondence text entry, "a", and this correspondence set is
represented by the symbol "(2)". In the third row, correspondence
phoneme "EY" represents another pronunciation of the same
correspondence text entry "ai" in the first row, and this
correspondence set is represented by symbol "(3)". These three rows
illustrate how the same text entry may have different
pronunciations, and different text entries may have the same
pronunciation.
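
As an illustrative sketch only, the three rows just described could be held in a small Python structure such as the one below. The class name CorrespondenceSet, its field names, and the two helper functions are assumptions made for this example; only the phoneme entries, text entries and symbols (1) through (3) come from the description of FIG. 3.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CorrespondenceSet:
    symbol: int       # correspondence symbol identifying the set
    phonemes: tuple   # correspondence phoneme entry, e.g. ("AE",)
    text: str         # correspondence text entry, e.g. "ai"

# The first three rows of table 240 as described above.
TABLE = [
    CorrespondenceSet(1, ("AE",), "ai"),
    CorrespondenceSet(2, ("AE",), "a"),
    CorrespondenceSet(3, ("EY",), "ai"),
]

def sets_for_text(table, text):
    """All correspondence sets sharing a given text entry."""
    return [s for s in table if s.text == text]

def sets_for_phonemes(table, phonemes):
    """All correspondence sets sharing a given phoneme entry."""
    return [s for s in table if s.phonemes == tuple(phonemes)]

# Same text entry, different pronunciations: symbols 1 and 3.
ai_symbols = [s.symbol for s in sets_for_text(TABLE, "ai")]
# Same pronunciation, different text entries: symbols 1 and 2.
ae_symbols = [s.symbol for s in sets_for_phonemes(TABLE, ["AE"])]
```

Storing the phoneme entry as a tuple leaves room for entries that represent multiple sounds, such as "y UH".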

Correspondence table 240 may be generated manually, i.e. by
typing the table into a computer file, or generated electronically, i.e.
by computer analysis of productive phoneme-text combinations. It
will be appreciated that each language, such as American English or
French, would use a different correspondence table 240.

FIG. 4 is a block diagram illustrating the decoder system
program 420 of compression system 180, and its input and output
data. Selected input text 410 may be stored in data storage 160 and
loaded into RAM 150 for examination. Decoder system program 420
receives a word from selected input text 410.

Decoder system 420 uses the decoder table 280 codes to
translate symbol sets of compressed pronunciation dictionary 270 in
searching for the compressed dictionary word whose text matches
the received word, and then in producing phonemes for the received
word. If the dictionary compressing method compressed
pronunciation dictionary 210 entries in the original alphabetical
order of the dictionary words, then the symbol sets are entered in
the same alphabetical order in compressed pronunciation dictionary
270. Thus, decoder system 420 could approximate the location of the
dictionary word which matches the input text word. Another
embodiment of the dictionary compressing method provides an index
to compressed pronunciation dictionary 270. Further, any technique
for searching a compressed file, such as a hashing function, may be
used.

Upon matching a compressed dictionary word to the input text 
word, decoder system 420 uses the decoder table 280 codes to 
retrieve dictionary phonemes 430 from the matching symbol set. 
Alternatively, as it searches the compressed dictionary and converts 
symbol sets to find a dictionary word which matches the received 
text, decoder system 420 may also convert the symbol sets to 
produce phonemes at the same time. 

For example, decoder system 420 receives the word "enough"
from selected text 410. Decoder system 420 uses decoder table 280
to decode symbol sets from compressed pronunciation dictionary 270
until decoding a symbol set to match the dictionary word "enough".
Upon finding a match, decoder system 420 uses decoder table 280 to
translate the symbol set into the output data phonemes "IH n UX f"
representing the pronunciation of the received text.
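
The "enough" example can be sketched in Python as shown below, purely as an illustration. The decoder table contents, the integer symbols, and the function names decode and lookup are assumptions made for this sketch; FIG. 3 and the actual symbol values are not reproduced here. The sketch uses the simple sequential search described above rather than an index or hashing function.

```python
# Hypothetical decoder table: symbol -> (decoder text entry, decoder phoneme entry).
# Symbol 100 stands in for a special symbol representing a silent character.
DECODER_TABLE = {
    1: ("e", ("IH",)),
    2: ("n", ("n",)),
    3: ("ou", ("UX",)),
    4: ("gh", ("f",)),
    100: ("k", ()),
}

def decode(symbol_set, decoder_table):
    """Translate one symbol set back into (word text, phoneme list)."""
    text, phonemes = "", []
    for symbol in symbol_set:
        entry_text, entry_phonemes = decoder_table[symbol]
        text += entry_text
        phonemes.extend(entry_phonemes)
    return text, phonemes

def lookup(word, compressed_dictionary, decoder_table):
    """Decode symbol sets in turn until one decodes to the requested word."""
    for symbol_set in compressed_dictionary:
        text, phonemes = decode(symbol_set, decoder_table)
        if text == word:
            return phonemes
    return None  # word not present in the compressed dictionary

# Hypothetical compressed dictionary holding a single symbol set.
COMPRESSED = [[1, 2, 3, 4]]
result = lookup("enough", COMPRESSED, DECODER_TABLE)
```

Because each symbol carries both text and phonemes, a single pass over a symbol set recovers the dictionary word and its pronunciation together.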

FIG. 5 is a flowchart illustrating a method 500 for compressing
pronunciation dictionary 210 and for using the compressed
dictionary to generate representative phonemes from selected input
text 410. Method 500 begins in step 510 by creating a
correspondence table 240 for a given language. Creating
correspondence table 240 comprises the step of inputting a number
of correspondence sets, each of which includes a phoneme entry
expressing at least one phonetic sound and a text entry which
indicates the phonetic sound or sounds, and inputting a
correspondence set identifying symbol. Step 510 preferably includes
inputting a correspondence set for each of the various text
representations of all of the phonemes used in pronunciation
dictionary 210.

Step 510 preferably further includes tuning function 260
(FIG. 2) using the current version of correspondence table 240 to
compress at least a portion of pronunciation dictionary 210 for
determining which correspondence sets are unproductive and what
other correspondence sets may be valuable if added. Tuning
function 260 may be re-applied to optimize correspondence table
240, thereby enabling matching system 250 to more effectively
compress pronunciation dictionary 210 and enabling compressed
pronunciation dictionary 270 to be further compressed by
subsequent compression techniques.

Program 215 in step 520 uses the optimized correspondence
table 240 to compress pronunciation dictionary 210. More
particularly, parsing routines 220 extract a dictionary entry
including a dictionary word and corresponding dictionary phonemes
from pronunciation dictionary 210, and store the dictionary entry in
data buffer 230.

Matching system 250 selects a first phoneme from the
dictionary entry, and retrieves all correspondence sets from
correspondence table 240 which start with the selected dictionary
phoneme to determine if a match can be made. Multiple dictionary
characters which together constitute a correspondence text entry in
correspondence table 240 are "related." Divisions between related
dictionary characters are typically harder to determine than
divisions between dictionary phonemes. Also, there are fewer
dictionary phonemes without corresponding dictionary characters
(e.g., as in abbreviations such as "Mrs." or "etc.") than there are
"silent" dictionary characters without phonemes. Therefore,
matching system 250 preferably selects a dictionary phoneme, and
attempts to match correspondence sets based on the dictionary
phoneme.

Matching system 250 compares the correspondence sets
retrieved from correspondence table 240 with the dictionary entry
to determine if any matches can be made. If only one match is
made, matching system 250 selects the correspondence symbol
associated with the matching correspondence set as the compressed
data entry for compressed pronunciation dictionary 270. If more
than one match can be made, matching system 250 selects as the
compressed data entry for compressed pronunciation dictionary 270
the symbol for the correspondence set corresponding to the best
match. If no match can be made, matching system 250 generates
special symbols to represent "silent" characters, or conversely
generates special symbols to represent phonemes unmatched to
dictionary text. Generation of special symbols is described in greater
detail with reference to FIG. 6. If a special symbol is generated, a
decoder code set representing the association of the special symbol to
the silent character or alternatively to the unmatched phoneme is
added to decoder table 280 for subsequently decoding the special
symbol.

Matching system 250 then selects the next unprocessed
phoneme, and repeats step 520 until the compressed data entries
have been generated for the entire dictionary entry. Examples of
this process are described with reference to Examples 1-3. After all
the pronunciation dictionary 210 entries have been compressed, the
symbol set, which possibly includes special symbols, is added to
compressed pronunciation dictionary 270. It will be appreciated that
step 510 and step 520 are typically performed by a product
developer.

Decoder system 420 in step 530 uses compressed
pronunciation dictionary 270 and decoder table 280 to generate
phonemes for selected text 410. Decoder system 420 receives a
selected word from text 410, and then uses decoder table 280 to
decode symbol sets from compressed pronunciation dictionary 270
until one of the decoded dictionary words matches the first input
word. Decoder system 420 next uses decoder table 280 to retrieve
the dictionary phonemes from the matching symbol set, and then
method 500 ends for the first input word. Step 530 repeats for
subsequently received words. It will be appreciated that step 530 is
typically performed by a customer.
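
The decoding of step 530 might look like the sketch below. It is offered only as an illustration under stated assumptions: DECODER_TABLE, COMPRESSED and lookup are hypothetical names, the table is a small excerpt built from the symbols used in Examples 1 and 2 of this description, and the linear scan simply mirrors the "decode until a word matches" wording rather than any particular search strategy.

```python
from typing import Dict, List, Optional, Tuple

# symbol -> (decoder text entry, decoder phoneme entry)
# Symbol 221 stands for the silent "w" of Example 2 (text, no phoneme).
DECODER_TABLE: Dict[int, Tuple[str, Tuple[str, ...]]] = {
    43: ("e", ("IH",)),
    161: ("n", ("n",)),
    89: ("ou", ("UX",)),
    123: ("gh", ("f",)),
    2: ("a", ("AE",)),
    173: ("s", ("s",)),
    221: ("w", ()),
    87: ("e", ("UX",)),
    155: ("r", ("r",)),
}

# Compressed pronunciation dictionary: one symbol set per entry.
COMPRESSED: List[List[int]] = [
    [43, 161, 89, 123],            # "enough"
    [2, 161, 173, 221, 87, 155],   # "answer"
]


def decode_word(symbols: List[int]) -> str:
    """Rebuild the dictionary word from a symbol set."""
    return "".join(DECODER_TABLE[s][0] for s in symbols)


def decode_phonemes(symbols: List[int]) -> List[str]:
    """Rebuild the dictionary phonemes from a symbol set."""
    return [ph for s in symbols for ph in DECODER_TABLE[s][1]]


def lookup(word: str) -> Optional[List[str]]:
    """Decode symbol sets until one spells `word`, then return its phonemes."""
    for symbols in COMPRESSED:
        if decode_word(symbols) == word:
            return decode_phonemes(symbols)
    return None


if __name__ == "__main__":
    print(lookup("answer"))
```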

FIG. 6 is a flowchart illustrating a preferred method 600 for
compressing an entry from pronunciation dictionary 210. Method
600 is repeated for every word in the dictionary to accomplish FIG. 5
step 520. Method 600 begins in step 605 by matching system 250
reading a dictionary entry, which comprises a dictionary word and a
dictionary phoneme entry representing the pronunciation of the
dictionary word, from buffer 230. Matching system 250 in step 610
determines whether any dictionary characters or dictionary
phonemes remain unprocessed in the dictionary entry. If not,
method 600 ends. Otherwise, matching system 250 in step 620
determines whether both a dictionary character and a dictionary
phoneme remain.

If both remain, matching system 250 in step 630 searches
correspondence table 240 for all correspondence sets that match
dictionary phoneme-character combinations of the remaining
portions of the dictionary entry. More particularly, selecting the next
currently-unmatched phoneme, matching system 250 retrieves all
correspondence sets which begin with the selected dictionary
phoneme. Matching system 250 then compares these
correspondence sets against the unmatched portions of the dictionary
entry.

If matching system 250 in step 640 finds at least one match,
then matching system 250 in step 650 selects the best match, assigns
and stores symbols for any pending silent dictionary characters, and
stores the correspondence symbol for the selected matching
correspondence set. Method 600 then returns to step 610.

To select the best match, matching system 250 first selects
from the matching sets as the tentative choice the correspondence
set having the most phonemes. If there is more than one set having
the most phonemes, then matching system 250 selects as the
tentative choice the set that has the most phonemes and the most
text characters. If there is more than one such set, matching
system 250 simply selects the first of them. The tentative choice is the
best match unless matching system 250 determines that one of the other
sets satisfies selected criteria, suggesting that it is a better choice.
The criteria include:

    (1) the other correspondence set is shorter than the current
    tentative choice, i.e., it has fewer phonemes or has the same
    number of phonemes and fewer text characters;

    (2) at least one unprocessed dictionary phoneme would remain
    in the dictionary entry after the non-tentative set is applied;
    and

    (3) there is a correspondence set that matches the next
    unprocessed dictionary phonemes and dictionary characters
    that would remain if the non-tentative set were to be applied.

If another set meets the above criteria, it becomes the tentative
choice. The process repeats until all other sets have been tested.
Method 600 then returns to step 610.
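
One possible reading of the best-match selection is sketched below. The helper names (find_matches, size, select_best) are hypothetical, symbols 900 and 901 are placeholders for sets whose symbols are not given in this description, and the order in which the other sets are tested is an assumption, since the description does not fix it.

```python
from typing import NamedTuple, Tuple


class CorrespondenceSet(NamedTuple):
    symbol: int
    phonemes: Tuple[str, ...]
    text: str


def find_matches(table, phonemes, text, p, t):
    """Sets whose phonemes match at index p and whose text matches at index t."""
    return [
        cs
        for cs in table
        if tuple(phonemes[p:p + len(cs.phonemes)]) == cs.phonemes
        and text.startswith(cs.text, t)
    ]


def size(cs):
    # Phoneme count first, then text length; tuples compare in that order.
    return (len(cs.phonemes), len(cs.text))


def select_best(table, matches, phonemes, text, p, t):
    # Tentative choice: most phonemes, then most text characters.
    # max() returns the first of several equally large candidates.
    tentative = max(matches, key=size)
    for other in matches:
        if other is tentative:
            continue
        next_p = p + len(other.phonemes)
        next_t = t + len(other.text)
        shorter = size(other) < size(tentative)        # criterion (1)
        phoneme_left = next_p < len(phonemes)          # criterion (2)
        if (shorter and phoneme_left
                and find_matches(table, phonemes, text, next_p, next_t)):  # (3)
            tentative = other
    return tentative


# Excerpt for the word "enough"; 900 and 901 are placeholder symbols.
TABLE = [
    CorrespondenceSet(43, ("IH",), "e"),
    CorrespondenceSet(161, ("n",), "n"),
    CorrespondenceSet(89, ("UX",), "ou"),
    CorrespondenceSet(900, ("UX",), "o"),
    CorrespondenceSet(123, ("f",), "gh"),
    CorrespondenceSet(901, ("f",), "f"),
]

if __name__ == "__main__":
    phonemes, text = ["IH", "n", "UX", "f"], "enough"
    matches = find_matches(TABLE, phonemes, text, 2, 2)
    print(select_best(TABLE, matches, phonemes, text, 2, 2).symbol)
```

In this excerpt UX->o satisfies criteria (1) and (2) but fails (3), because no set for "f" matches the remaining characters "ugh", so UX->ou remains the choice.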

If in step 640 no matches are found, matching system 250 in
step 685 determines whether a threshold number of dictionary
characters are currently assumed silent. If not, matching system 250
in step 690 considers the next dictionary character as silent, and
returns to step 610. If in step 685 a set threshold number of silent
characters are pending, matching system 250 in step 695 assigns and
stores a special symbol for the current phoneme and considers
pending silent dictionary characters as no longer silent, i.e. re-labels
the pending silent characters as unprocessed. Method 600 then
returns to step 610.

If in step 620 matching system 250 determines that there are
not both a dictionary character and a phoneme remaining in the
dictionary entry, then matching system 250 in step 660 determines
whether it is characters or phonemes that remain. If characters
remain, matching system 250 in step 670 assigns and stores special
symbols for all pending silent and all remaining dictionary
characters. If only phonemes remain, system 250 proceeds to step
695 and continues as explained above.
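
The control flow of method 600 can be approximated by the sketch below. It is a simplified illustration, not the disclosed implementation: the Compressor class and its parameters are hypothetical, the silent-character threshold of 2 and the first special symbol 221 are assumed values, special symbols are reused for a repeated silent character or unmatched phoneme as an assumption, symbol 900 is a placeholder for UX->o, and step 650 here uses only the tentative choice (most phonemes, then most characters) rather than the full three-criteria test.

```python
from typing import NamedTuple, Tuple


class CorrespondenceSet(NamedTuple):
    symbol: int
    phonemes: Tuple[str, ...]
    text: str


def find_matches(table, phonemes, text, p, t):
    return [
        cs
        for cs in table
        if tuple(phonemes[p:p + len(cs.phonemes)]) == cs.phonemes
        and text.startswith(cs.text, t)
    ]


class Compressor:
    def __init__(self, table, first_special=221, silent_threshold=2):
        self.table = table
        self.next_special = first_special
        self.silent_threshold = silent_threshold
        self.decoder_extra = {}   # (text, phoneme) -> special symbol

    def special(self, text, phoneme):
        """Assign (or reuse) a special symbol and record its decoder code set."""
        key = (text, phoneme)
        if key not in self.decoder_extra:
            self.decoder_extra[key] = self.next_special
            self.next_special += 1
        return self.decoder_extra[key]

    def compress(self, text, phonemes):
        out = []
        p = t = 0      # next unprocessed phoneme / character
        pending = 0    # characters text[t - pending:t] currently assumed silent
        while p < len(phonemes) or t < len(text):                 # step 610
            if p < len(phonemes) and t < len(text):               # step 620
                matches = find_matches(self.table, phonemes, text, p, t)  # 630
                if matches:                                       # 640 -> 650
                    best = max(matches,
                               key=lambda cs: (len(cs.phonemes), len(cs.text)))
                    out += [self.special(c, None) for c in text[t - pending:t]]
                    out.append(best.symbol)
                    p += len(best.phonemes)
                    t += len(best.text)
                    pending = 0
                elif pending < self.silent_threshold:             # 685 -> 690
                    pending += 1
                    t += 1
                else:                                             # step 695
                    out.append(self.special(None, phonemes[p]))
                    p += 1
                    t -= pending
                    pending = 0
            elif t < len(text):                                   # 660 -> 670
                out += [self.special(c, None) for c in text[t - pending:]]
                t = len(text)
                pending = 0
            else:                                                 # 660 -> 695
                out.append(self.special(None, phonemes[p]))
                p += 1
                t -= pending
                pending = 0
        return out


TABLE = [
    CorrespondenceSet(43, ("IH",), "e"), CorrespondenceSet(161, ("n",), "n"),
    CorrespondenceSet(89, ("UX",), "ou"), CorrespondenceSet(900, ("UX",), "o"),
    CorrespondenceSet(123, ("f",), "gh"), CorrespondenceSet(2, ("AE",), "a"),
    CorrespondenceSet(173, ("s",), "s"), CorrespondenceSet(87, ("UX",), "e"),
    CorrespondenceSet(155, ("r",), "r"),
]

if __name__ == "__main__":
    c = Compressor(TABLE)
    print(c.compress("answer", ["AE", "n", "s", "UX", "r"]))
```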

Example 1: "IH n UX f" and "enough"

Matching system 250 retrieves the dictionary word "enough"
and the dictionary phonemes "IH n UX f" from data buffer 230.
Matching system 250 then selects the first dictionary phoneme "IH,"
retrieves from exemplary table 240 (FIG. 3) all correspondence sets
which begin with the selected phoneme (IH->a, IH->e, IH->i, IH->o,
IH->u, IH->y), and determines if any of the correspondence sets
match.

IH n UX f enough 



Matching system 250 finds only one match (IH->e) and accordingly
"remembers," i.e. stores in memory, the symbol "(43)" representing
the match.

Matching system 250 then selects the next unprocessed
phoneme "n" and retrieves the correspondence sets (ny->gn, n->en,
n->gn, n->kn, n->nn, n->n).

IH n UX f    e nough
   ^           ^

Matching system 250 finds only one match (n->n), and remembers the
symbol "(161)" representing the only match.

Matching system 250 selects the third phoneme "UX" and
retrieves the correspondence sets (UXr->r, UX->a, UX->eu, UX->e,
UX->i, UX->ou, UX->o, UX->u, UX->y).

IH n UX f    en ough
     ^          ^

Matching system 250 finds two matches (UX->o and UX->ou), and
selects the better match. Since UX->ou has more text characters,
matching system 250 selects it as the tentative best match. Matching
system 250 applies the three-criteria test as described with
reference to FIG. 6 to confirm the best match assumption. The other
set UX->o has fewer text characters, and a phoneme remains after it
is applied. However, a match cannot be made with the next
currently unmatched phoneme "f" and remaining text characters
"ugh." Thus, matching system 250 selects UX->ou as the best match,
and accordingly remembers the correspondence symbol "(89)."
Matching system 250 selects the next unprocessed phoneme "f"
and retrieves the correspondence sets (f->ff, f->f, f->gh, f->ph).

IH n UX f    enou gh
        ^         ^

Matching system 250 finds only one match (f->gh) and thus
remembers "(123)" as representing the match. Accordingly, matching
system 250 stores the symbol set "43 161 89 123" in compressed
pronunciation dictionary 270 as the compressed data entry
representing the dictionary word "enough" and its dictionary
phonemes "IH n UX f."

Example 2: "AE n s UX r" and "answer" 

Matching system 250 retrieves the dictionary word "answer"
and the dictionary phonemes "AE n s UX r". In a manner similar to
that described in Example 1, matching system 250 matches the
dictionary combination AE->a and represents it by correspondence
symbol "2," matches the dictionary combination n->n and represents
it by symbol "161" and matches the dictionary combination s->s and
represents it by symbol "173." At this time, matching system 250
selects dictionary phoneme "UX" and retrieves the correspondence
sets (UXr->r, UX->a, UX->eu, UX->e, UX->i, UX->ou, UX->o, UX->u,
UX->y).

AE n s UX r    ans wer
       ^           ^

Matching system 250 finds no match. Thus, matching system 250
assumes that "w" is silent.

With the "w" silent, matching system 250 examines the 
correspondence sets with the remaining unprocessed dictionary 
entry. 

AE n s UX r    answ er
       ^            ^

Matching system 250 finds a match (UX->e), and thus assigns and
stores a special symbol such as "221" to represent the silent
dictionary character "w" and remembers the symbol "(87)." Further,
matching system 250 adds the decoder code set, for example "221 w
∅" wherein the empty set represents no phoneme, to decoder table
280.

Lastly, matching system 250 selects the dictionary phoneme "r"
and retrieves the correspondence sets (r->rr, r->er, rr->r, r->r).

AE n s UX r    answe r
          ^          ^

Matching system 250 finds only one match (r->r), and selects the
symbol "(155)." Matching system 250 adds "2 161 173 221 87 155"
to compressed pronunciation dictionary 270 as a symbol set
representing the word "answer" and the phonemes "AE n s UX r."

Example 3: "r IH D AX m" and "rhythm"

Matching system 250 retrieves the dictionary word "rhythm"
and the dictionary phonemes "r IH D AX m," selects the first
dictionary phoneme "r" and retrieves the correspondence sets (r->rr,
r->er, rr->r, r->r).

r IH D AX m    rhythm
^              ^

Matching system 250 finds only one match (r->r), and remembers
the symbol "169."

Matching system 250 selects the next unprocessed phoneme
"IH" and retrieves the correspondence sets (IH->a, IH->e, IH->i, IH->o,
IH->u, IH->y).

r IH D AX m    r hythm
  ^              ^

Matching system 250 finds no matches. Accordingly, matching
system 250 assumes the "h" is silent. With the "h" silent, matching
system 250 then examines the remaining portions of the dictionary
entry.

r IH D AX m    rh ythm
  ^               ^

Matching system 250 finds a match (IH->y), assigns a special symbol
such as "222" for silent "h" and remembers the symbol "47."
Matching system 250 then selects the next unprocessed
phoneme "D" and retrieves only correspondence set (D->th), since in
this example matching system 250 is case sensitive.

r IH D AX m rhy thm 

Matching system 250 finds a match and remembers the symbol
"120."

Matching system 250 then selects the next unprocessed
phoneme "AX" and retrieves the correspondence sets (AXk->c, AXl->l,
AXm->m, AX->a, AX->e, AX->ia, AX->i, AX->o, AX->u, AX->y, AX->').

r IH D AX m    rhyth m
       ^             ^

Matching system 250 finds only one match (AXm->m), and
remembers symbol "(16)." Since no other characters exist, matching
system 250 adds "169 222 47 120 16" to compressed pronunciation
dictionary 270 as a symbol set representing the dictionary word
"rhythm" and the corresponding phonemes "r IH D AX m."

If for example the correspondence set AXm->m were not
included in correspondence table 240, matching system 250 would
find no match. Accordingly, matching system 250 would assume the
text character "m" is silent. Since only phonemes would remain,
matching system 250 would emit a special symbol such as "223" for
current phoneme "AX" and would consider the text character "m" as
no longer silent. Matching system 250 would then retrieve the
correspondence sets (m->lm, m->mm, m->m) for phoneme "m", would
find the only match m->m, and would remember the symbol "155."
Since no other characters would exist, matching system 250 would
add "169 222 47 120 223 155" to compressed pronunciation
dictionary 270 as a symbol set representing the dictionary word
"rhythm" and the corresponding phonemes "r IH D AX m."

The present invention advantageously provides a system and
method for compressing a pronunciation dictionary. This is
especially useful, for example, as a precursor to other compression
techniques. The system and method take advantage of the natural
redundancy between dictionary text and dictionary phonemes. Since
compression system 180 substitutes symbols for sets of dictionary
words and phonemes, memory required to store the information is
reduced by approximately one-third to one-half.

For example, each character in a word may be represented by
five bits (since there are twenty-six letters in the English alphabet),
and each phoneme may be represented by six bits (since there are
about thirty-nine phonemes for American English as illustrated in
FIG. 7). Further, dictionary words and the set of phonemes for each
dictionary word are divided by a terminator character. The word
"enough" requires seven characters (including the terminator
character) and thus occupies thirty-five bits. The corresponding
phoneme set "IH n UX f" requires five characters (including the
terminator character), and thus occupies thirty bits. Thus, the total
memory for storing this dictionary entry is sixty-five bits.

A decoder table 280, as shown in FIG. 3, has about 220
correspondence sets and 220 correspondence symbols. Accordingly,
eight bits are needed to represent a correspondence symbol. As
illustrated in the first example, the four symbols "(43)", "(161)",
"(89)" and "(123)" represent the word "enough" and phonemes "IH n
UX f". Thus, five symbols (including the terminator character) are
needed and occupy forty bits. Forty bits provide a thirty-eight
percent savings over the uncompressed sixty-five bits.
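
The arithmetic above can be checked with a short calculation. The function name bits_for is hypothetical, and counting the terminator as one extra code in each alphabet is an assumption made for this sketch; it does not change the five-, six- and eight-bit widths stated above.

```python
def bits_for(distinct_values: int) -> int:
    """Smallest number of bits that can distinguish `distinct_values` codes."""
    return max(1, (distinct_values - 1).bit_length())


CHAR_BITS = bits_for(26 + 1)      # 26 letters plus a terminator
PHONEME_BITS = bits_for(39 + 1)   # about 39 phonemes plus a terminator
SYMBOL_BITS = bits_for(220 + 1)   # about 220 symbols plus a terminator

word = "enough"
phonemes = ["IH", "n", "UX", "f"]
symbols = [43, 161, 89, 123]

# Each sequence is followed by one terminator code.
uncompressed_bits = (len(word) + 1) * CHAR_BITS + (len(phonemes) + 1) * PHONEME_BITS
compressed_bits = (len(symbols) + 1) * SYMBOL_BITS
saving = 1 - compressed_bits / uncompressed_bits

if __name__ == "__main__":
    print(uncompressed_bits, compressed_bits, f"{saving:.1%}")
```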

The foregoing description of the preferred embodiments of the
invention is by way of example only, and other variations are
provided by the present invention. For example, components of this
invention may be implemented using a programmed general purpose
digital computer, using application specific integrated circuits, or
using a network of interconnected conventional components and
circuits. Further, although the invention has been described with
reference to a dictionary, any guide having text and phonemes can
be compressed using the system and method of the present
invention. Still further, although the invention has been described
using phonemes, other alternative means for representing
pronunciation of text are possible, such as allophones, syllables or
symbols generated by an earlier compression system. The
embodiments described herein are presented for purposes of
illustration and are not intended to be exhaustive or limiting. Many
variations and modifications are possible in light of the foregoing
teaching. The system is limited only by the following claims.

WHAT IS CLAIMED IS:

1 1. A system for compressing a pronunciation guide which includes

2 a plurality of guide entries, each entry having a guide word and at 

3 least one associated phoneme representing the pronunciation of the 

4 word, the system comprising: 



5 memory storing 

6 (1) a correspondence table which includes a plurality of 

7 correspondence sets, each set having 

8 (i) a text entry, 

9 (ii) a phoneme entry representing a 

10 pronunciation of the text entry, and 

11 (iii) a symbol identifying the correspondence set; 

12 and 

13 (2) a matching system for comparing a selected guide

14 word and the associated phonemes with correspondence sets, 

15 and storing correspondence symbols which represent matching 

16 correspondence sets as a compressed pronunciation guide entry 

17 in the memory; and 

18 a processing unit coupled to the memory for controlling the 



19 operations of the matching system. 

1 2. The system of claim 1 wherein the correspondence table 

2 includes correspondence sets for productive combinations of 

3 phonemes and text in a particular language. 

1 3. The system of claim 1 wherein the correspondence symbols are

2 numbers, each representing the position of the respective 

3 correspondence set in the correspondence table. 

1 4. The system of claim 1 wherein the memory further stores a 

2 tuning function which enables deletion of unproductive 

3 correspondence sets from the correspondence table. 

1 5. The system of claim 1 wherein the matching system 

2 compares various correspondence sets, and if several matches 

3 are made selects the best match. 

1 6. The system of claim 1 wherein the matching system generates 

2 a special symbol representing a silent character, and stores the 

3 special symbol as part of a compressed pronunciation guide entry in 

4 the memory. 

1 7. The system of claim 1 wherein the matching system generates 

2 a special symbol representing a phoneme without any corresponding 

3 characters, and stores the special symbol as part of a compressed 

4 pronunciation guide entry in the memory. 

1 8. The system of claim 1 wherein the matching system generates 

2 a decoder table comprising decoder code sets for use in subsequently 

3 de-compressing compressed pronunciation guide entries. 

1 9. The system of claim 8 wherein the decoder code sets replicate 

2 a portion of the correspondence table. 

1 10. The system of claim 8 wherein the decoder code sets include

2 symbols representing silent text. 

1 11. The system of claim 8 wherein the decoder code sets include 

2 symbols representing phonemes without corresponding characters. 

1 12. The system of claim 1 wherein the matching system selects 

2 correspondence sets from the correspondence table for comparison 

3 with characters and phonemes from the guide entry. 

1 13. The system of claim 1 wherein the pronunciation guide 

2 includes a pronunciation dictionary. 

1 14. A system for using a compressed pronunciation guide and 

2 decoder table to decode selected text, comprising: 



3 memory storing 

4 (1) a compressed pronunciation guide having a 

5 plurality of symbol sets, each symbol set representing a guide 

6 word and at least one corresponding guide phoneme 

7 representing the pronunciation of the guide word, 

8 (2) a decoder table having a plurality of decoder code 

9 sets for translating symbol sets, each decoder code set 

10 including a decoder text entry, a decoder phoneme entry and a 

11 decoder symbol representing the decoder code set; 

12 (3) a decoder system for using the decoder table to 

13 translate symbol sets to find a guide word which matches the 

14 selected text, and upon finding a match using the decoder table 

15 to retrieve the decoder phonemes from the matching symbol

16 set; and 

17 a processor coupled to the memory for controlling the 

18 operations of the decoder system. 

1 15. The system of claim 14 wherein the decoder code sets include 

2 symbols representing silent text. 

1 16. The system of claim 14 wherein the decoder code sets include 

2 symbols representing phonemes without corresponding characters. 

1 17. A computer-based method for compressing a pronunciation

2 guide which includes a plurality of guide entries, each entry having a

3 guide word and at least one associated guide phoneme representing

4 the pronunciation of the guide word, comprising the steps of:

5 providing a computer memory;

6 storing in a first portion of the computer memory a

7 correspondence table which includes a plurality of correspondence 

8 sets, each correspondence set including a correspondence text entry, 

9 a correspondence phoneme entry representing a pronunciation of the 

10 correspondence text entry and a unique correspondence symbol 

11 identifying the correspondence set; 

12 receiving a guide word and at least one guide phoneme 

13 representing the pronunciation of the guide word; 

14 comparing the guide word and guide phonemes with 

15 correspondence sets; and 

16 storing the correspondence symbols representing matching

17 correspondence sets as compressed pronunciation guide entries in a 

18 second portion of the computer memory. 

1 18. The method of claim 17 wherein the correspondence table 

2 includes correspondence sets for productive combinations of 

3 phonemes and text in a particular language. 

1 19. The method of claim 17 wherein the correspondence symbol is 

2 a number representing the position of the correspondence set in the 

3 correspondence table. 

1 20. The method of claim 17 further comprising, after the step of 

2 storing in a first portion and before the step of receiving, the step of 

3 deleting unproductive correspondence sets from the correspondence 

4 table. 

1 21. The method of claim 17 wherein the step of comparing further 

2 comprises: 

3 selecting a next currently-unmatched guide phoneme from the 

4 guide entry; 

5 retrieving all correspondence sets from the correspondence 

6 table which begin with the selected guide phoneme; and 

7 comparing the retrieved correspondence sets with the 

8 remaining portions of the guide entry. 

1 22. The method of claim 21 wherein the step of comparing

2 further comprises examining several correspondence sets, and 

3 if multiple matches are made selecting the best match. 

1 23. The method of claim 22 wherein the step of comparing further 

2 comprises, if a match is not made, generating a special symbol which 

3 represents a silent guide character, and storing the special symbol in 

4 the memory as part of a compressed pronunciation guide entry. 

1 24. The method of claim 22 wherein the step of comparing further 

2 comprises, if a match is not made, generating a special symbol which 

3 represents a phoneme without any corresponding guide characters, 

4 and storing the special symbol in the memory as part of a 

5 compressed pronunciation guide entry. 

1 25. The method of claim 17 and further comprising the step of 

2 generating a decoder table including decoder code sets for de- 

3 compressing the compressed pronunciation guide entries. 

1 26. The method of claim 25 wherein the decoder code sets 

2 replicate a portion of the correspondence table. 

1 27. The method of claim 25 wherein the decoder code sets include 

2 symbols representing silent text. 

1 28. The method of claim 25 wherein the decoder code sets include 

2 symbols representing phonemes without corresponding guide 

3 characters. 

1 29. The method of claim 17 further comprising, after the step of

2 storing in a second portion, the step of using the compressed 

3 pronunciation guide to generate phonemes representing the selected 

4 text. 

1 30. A computer-based method for using a compressed 

2 pronunciation guide and a decoder table to retrieve phonemes for 

3 selected text, comprising the steps of: 



4 providing computer memory; 

5 storing in a first portion of the computer memory a compressed 

6 pronunciation guide which includes a plurality of symbol sets, each 

7 symbol set representing a guide word and at least one guide 

8 phoneme representing the pronunciation of the guide word; 

9 storing in a second portion of the computer memory a decoder 

10 table which includes a plurality of decoder sets, each decoder set 

11 having a decoder text entry, a decoder phoneme entry representing 

12 the pronunciation of the decoder text entry, and a unique decoder set 

13 identifying symbol; 

14 receiving selected text; 

15 using the decoder table to decode a symbol set in the 

16 pronunciation guide to produce a guide word; 

17 comparing the selected text with the guide word to determine 

18 if they match; and 

19 if a match is made, using the decoder table to retrieve the 

20 guide phonemes corresponding to a matching symbol set. 

1 31. A computer storage medium storing a computer program for

2 causing a computer to perform the steps of: 

3 allocating computer memory; 

4 storing in a first portion of the computer memory a compressed 

5 pronunciation guide which includes a plurality of symbol sets, each 

6 symbol set representing a guide word and guide phonemes 

7 representing the pronunciation of the guide word; 

8 storing in a second portion of the computer memory a decoder 

9 table which includes a plurality of decoder sets, each decoder set 

10 having a decoder text entry, a decoder phoneme entry representing 

11 the pronunciation of the decoder text entry, and a unique decoder set 

12 identifying symbol; 

13 receiving selected text; 

14 using the decoder table to decode a symbol set in the 

15 pronunciation guide to produce a guide word; 

16 comparing the selected text with the guide word to determine 

17 if they match; and 

18 if a match is made, using the decoder table to retrieve the 

19 guide phonemes corresponding to a matching symbol set. 

1 32. A computer storage medium storing a computer program for 

2 causing a computer to perform the steps of: 

3 allocating computer memory; 

4 storing in a first portion of the computer memory a 

5 correspondence table which includes a plurality of correspondence 

6 sets, each correspondence set including a correspondence text entry, 

7 a correspondence phoneme entry representing the pronunciation of 

8 the correspondence text entry and a unique correspondence symbol

9 identifying each correspondence set; 

10 receiving a guide word and at least one guide phoneme 

11 representing the pronunciation of the guide word; 

12 comparing the guide word and guide phonemes with 

13 correspondence sets; and 

14 storing the correspondence symbols representing matching 

15 correspondence sets as compressed pronunciation guide entries, in a 

16 second portion of the computer memory. 

1 33. A computer-based system for compressing a pronunciation 

2 guide, which includes a guide word and at least one guide phoneme 

3 representing the pronunciation of the guide word, comprising: 

4 computer memory; 

5 means for storing in a first portion of the computer memory a 

6 correspondence table which includes a plurality of correspondence 

7 sets, each correspondence set including a correspondence text entry, 

8 a correspondence phoneme entry representing the pronunciation of 

9 the correspondence text entry, and a unique correspondence symbol 

10 identifying the correspondence set; 

11 means for receiving a guide word and at least one guide 

12 phoneme representing the pronunciation of the guide word; 

13 means for comparing the guide word and guide phonemes with 

14 correspondence sets; and 

15 means for storing the correspondence symbols representing 

16 matching correspondence sets as a compressed pronunciation guide 

17 entry in a second portion of the computer memory. 

1 34. A computer data storage medium storing a correspondence

2 table which enables compression of a pronunciation dictionary, the 

3 correspondence table comprising a plurality of correspondence sets,

4 each correspondence set including a correspondence text entry and a 

5 correspondence phoneme entry representing the pronunciation of 

6 the correspondence text entry, and a correspondence symbol 

7 identifying the correspondence set. 






PATENT 



SYSTEM AND METHOD FOR USING A CORRESPONDENCE TABLE TO 
COMPRESS A PRONUNCIATION GUIDE 

ABSTRACT OF THE DISCLOSURE 
Parsing routines extract from a conventional pronunciation
dictionary an entry, which includes a dictionary word and dictionary
phonemes representing the pronunciation of the dictionary word. A
correspondence table is used to compress the pronunciation
dictionary. The correspondence table includes correspondence sets
for a particular language, each set having a correspondence text
entry, a correspondence phoneme entry representing the
pronunciation of the correspondence text entry and a unique
correspondence set identifying symbol. A matching system
compares a dictionary entry with the correspondence sets, and
replaces the dictionary entry with the symbols representing the best
matches. In the absence of a match, symbols representing silent text
or unmatched phonemes can be used. The correspondence symbols
representing the best matches provide compressed pronunciation
dictionary entries. The matching system also generates decoder code
sets for subsequently translating the symbol sets. A decoder system
uses the decoder code sets for translating symbol sets in the
compressed pronunciation dictionary to generate phonemes
corresponding to selected text.

FIG. 3: A Correspondence Table for American English

Additionally, the symbol "1" is used to indicate primary stress,